The sample complexity of agnostic learning with deterministic labels
Abstract
We investigate agnostic learning when there is no noise in the labeling function, that is, the labels are deterministic. We show that in this setting, in contrast to the fully agnostic learning setting (with possibly noisy labeling functions), the sample complexity of learning a binary hypothesis class is not fully determined by the VC-dimension of the class. For any d, we present classes of VC-dimension d that are learnable from O(d/ε) many samples and classes that require samples of size Ω(d/ε²). Furthermore, we show that in this setting, there exist classes where ERM algorithms are not optimal: while the class can be learned with sample complexity O(d/ε), the convergence rate of any ERM algorithm is only Ω(d/ε²). We introduce a new combinatorial parameter of a class of binary-valued functions and show that it provides a full combinatorial characterization of the sample complexity of deterministic-label agnostic learning of a class.
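For context, the rates quoted above can be read against the standard agnostic-PAC definitions; the following summary and its notation (D, ℓ, err) are a reader's aid, not quoted from the paper. With a marginal distribution D over the domain X and a fixed deterministic labeling function ℓ : X → {0,1} that need not belong to the class H, the error of a hypothesis h is

\[
\mathrm{err}_D(h) \;=\; \Pr_{x \sim D}\bigl[\, h(x) \neq \ell(x) \,\bigr],
\]

and a learner (ε, δ)-agnostically learns H from m samples if, with probability at least 1 − δ over the sample, its output ĥ satisfies

\[
\mathrm{err}_D(\hat{h}) \;\le\; \min_{h \in \mathcal{H}} \mathrm{err}_D(h) \;+\; \epsilon .
\]

For a class of VC-dimension d, the fully agnostic setting has sample complexity Θ((d + log(1/δ))/ε²), while the realizable setting (ℓ ∈ H, zero approximation error) needs only on the order of (d + log(1/δ))/ε samples up to logarithmic factors. The O(d/ε) versus Ω(d/ε²) gap in the abstract thus says that deterministic-label classes of the same VC-dimension can sit at either end of this range.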
Similar resources
The sample complexity of agnostic learning under deterministic labels
With the emergence of Machine Learning tools that allow handling data with a huge number of features, it becomes reasonable to assume that, over the full set of features, the true labeling is (almost) fully determined. That is, the labeling function is deterministic, but not necessarily a member of some known hypothesis class. However, agnostic learning of deterministic labels has so far receiv...
Agnostic Learning by Refuting
The sample complexity of learning a Boolean-valued function class is precisely characterized by its Rademacher complexity. This has little bearing, however, on the sample complexity of efficient agnostic learning. We introduce refutation complexity, a natural computational analog of the Rademacher complexity of a Boolean concept class, and show that it exactly characterizes the sample complexity of ...
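For reference, the Rademacher complexity invoked here has the standard definition below (standard notation, assumed rather than quoted from this abstract): for a class F of functions into {±1} and a sample of m points drawn i.i.d. from D,

\[
\mathfrak{R}_m(\mathcal{F}) \;=\; \mathbb{E}_{x_1,\dots,x_m \sim D}\; \mathbb{E}_{\sigma \sim \{\pm 1\}^m} \left[\, \sup_{f \in \mathcal{F}} \frac{1}{m} \sum_{i=1}^{m} \sigma_i f(x_i) \right],
\]

where the σ_i are independent uniform random signs. Standard uniform-convergence bounds control the gap between the empirical and true error of every f ∈ F by O(𝔯 + √(log(1/δ)/m)) with 𝔯 = \mathfrak{R}_m(\mathcal{F}), which is the information-theoretic sense in which Rademacher complexity characterizes agnostic sample complexity; the refutation-complexity result concerns the computationally efficient analogue.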
Probabilistic Lipschitzness: A niceness assumption for deterministic labels
We present Probabilistic Lipschitzness (PL), a notion of marginal label relatedness that is particularly useful for modeling niceness of distributions with deterministic labeling functions. We present convergence rates for Nearest Neighbor learning under PL. We further summarize reductions in labeled sample complexity for learning with unlabeled data (semi-supervised and active learning) under PL.
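As a reader's aid, one common formalization of Probabilistic Lipschitzness is sketched below; the quantifiers here are an illustrative assumption and may differ in detail from the paper's exact definition. In a metric space (X, d) with marginal distribution D and deterministic labeling ℓ, the pair (D, ℓ) satisfies φ-PL for a function φ : ℝ⁺ → [0,1] if, for every λ > 0,

\[
\Pr_{x \sim D}\Bigl[\, \exists\, y \in \mathrm{supp}(D) :\; d(x,y) \le \lambda \;\wedge\; \ell(y) \neq \ell(x) \,\Bigr] \;\le\; \phi(\lambda).
\]

Intuitively, φ(λ) bounds the probability mass lying within distance λ of the decision boundary: φ ≡ 0 up to some λ₀ recovers a hard-margin condition, and a faster-decaying φ yields stronger Nearest Neighbor convergence rates.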
Active Learning: Disagreement Coefficient
In previous lectures we saw examples in which active learning gives an exponential improvement in the number of labels required for learning. In this lecture we describe the Disagreement Coefficient, a measure of the complexity of an active learning problem proposed by Steve Hanneke in 2007. We will derive an algorithm for the realizable case and analyze it using the disagreement coefficient. I...
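For reference, Hanneke's disagreement coefficient is standardly defined as follows (notation assumed here; some presentations add a lower cutoff on the radius r). For a hypothesis h, class H, and marginal distribution D, let

\[
B(h, r) \;=\; \bigl\{ h' \in \mathcal{H} : \Pr_{x \sim D}[\, h'(x) \neq h(x) \,] \le r \bigr\}, \qquad
\mathrm{DIS}(V) \;=\; \bigl\{ x : \exists\, h, h' \in V,\ h(x) \neq h'(x) \bigr\},
\]

\[
\theta_h \;=\; \sup_{r > 0} \; \frac{\Pr_{x \sim D}\bigl[\, x \in \mathrm{DIS}(B(h, r)) \,\bigr]}{r}.
\]

In the realizable case, disagreement-based algorithms such as CAL have label complexity scaling roughly as θ · d · polylog(1/ε), which is where the exponential savings over passive learning arise when θ is bounded.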